@aidanheerdegen
Member

Add oasis3-mct v5.2, pointing at a newly created branch cloned from upstream GitLab.

Closes #366

@aidanheerdegen
Member Author

I added specific CI manifests to test the @5.2 version explicitly, as this isn't the default version.

@aidanheerdegen
Member Author

But unfortunately the gcc build is failing. Do those build failures look at all familiar @anton-seaice?

@anton-seaice
Contributor

anton-seaice commented Jan 27, 2026

Doesn't mean anything to me

This wasn't tested in #351

nor is it the failure we normally see in esm1.6 (which is gcom) - #342 - although the logs have expired: https://github.com/ACCESS-NRI/access-spack-packages/actions/runs/18839277073/job/53747477537

@aidanheerdegen
Member Author

Doesn't mean anything to me

This wasn't tested in #351

Ah good, then it isn't a known regression.

Ok, well I'm happy to remove the gcc test for the time being, but also open to suggestions if there is a simple fix.

@aidanheerdegen
Member Author

aidanheerdegen commented Jan 27, 2026

Annoyingly, it built fine with gcc in the spack v1.1 testing

https://github.com/ACCESS-NRI/access-spack-packages/actions/runs/19923566643/job/57122243658?pr=332

Addendum: It didn't build above, it was pulled from a binary cache, but it did build at some point in the past with spack v1.1

@anton-seaice
Contributor

@manodeep ?

@anton-seaice
Contributor

anton-seaice commented Jan 27, 2026

Looks to me like it was a different version that built:

 -   5f45mxk  oasis3-mct@stable~deterministic~optimisation_report build_system=makefile commit=41b67f418ec4fef537b952dda584d5eeed310ff8 platform=linux os=rocky8 target=x86_64 %c,[email protected]

i.e. @stable

@aidanheerdegen
Member Author

Well spotted. I've set the CI to test against @stable and @5.2 but you're right, it is the @5.2 build that is failing first (bemqxbe)

I'll remove the @5.2 from the gcc CI and see what happens ...

@aidanheerdegen
Member Author

aidanheerdegen commented Jan 27, 2026

Ok, so 5.2 does not build with gcc right now, thanks for the hint @anton-seaice

@manodeep
Collaborator

After some futzing around to get v5.2 to start compiling with gcc, here's the actual error I get:

/g/data/tm70/ms2335/oneapi_compilation_dir/test-upstream-oa3-mct-with-gcc/oasis3-mct-OASIS3-MCT_5.2/lib/scrip/src/remap_bicubic_reduced.F90:916:25:

  908 |           CALL MPI_IRecv(grid1_add_map1(buff_base),&
      |                         2
......
  916 |           CALL MPI_IRecv(wts_map1(:,buff_base),&
      |                         1
Error: Type mismatch between actual argument at (1) and actual argument at (2) (REAL(8)/INTEGER(4)).
/g/data/tm70/ms2335/oneapi_compilation_dir/test-upstream-oa3-mct-with-gcc/oasis3-mct-OASIS3-MCT_5.2/lib/scrip/src/remap_bicubic_reduced.F90:920:25:

  908 |           CALL MPI_IRecv(grid1_add_map1(buff_base),&
      |                         2
......
  920 |           CALL MPI_IRecv(grid2_frac(ila_mpi_mn(ib_i+1)),&
      |                         1
Error: Type mismatch between actual argument at (1) and actual argument at (2) (REAL(8)/INTEGER(4)).
/g/data/tm70/ms2335/oneapi_compilation_dir/test-upstream-oa3-mct-with-gcc/oasis3-mct-OASIS3-MCT_5.2/lib/scrip/src/remap_bicubic_reduced.F90:941:22:

  935 |         CALL MPI_Send(grid1_add_map1,num_links_map1,MPI_INT,&
      |                      2
......
  941 |         CALL MPI_Send(wts_map1,num_wts*num_links_map1,MPI_DOUBLE,&
      |                      1
Error: Type mismatch between actual argument at (1) and actual argument at (2) (REAL(8)/INTEGER(4)).
/g/data/tm70/ms2335/oneapi_compilation_dir/test-upstream-oa3-mct-with-gcc/oasis3-mct-OASIS3-MCT_5.2/lib/scrip/src/remap_bicubic_reduced.F90:944:22:

  935 |         CALL MPI_Send(grid1_add_map1,num_links_map1,MPI_INT,&
      |                      2
......
  944 |         CALL MPI_Send(grid2_frac(ila_mpi_mn(mpi_rank_map+1)),&
      |                      1
Error: Type mismatch between actual argument at (1) and actual argument at (2) (REAL(8)/INTEGER(4)).
make[2]: *** [Makefile:42: remap_bicubic_reduced.o] Error 1

[~/gdata/oneapi_compilation_dir/test-upstream-oa3-mct-with-gcc/oasis3-mct-OASIS3-MCT_5.2/util/make_dir @gadi-login-02]$ ml
Currently Loaded Modulefiles:
 1) gcc/15.1.0   2) openmpi/5.0.8   3) netcdf/4.9.2  

Looks to be an upstream issue that gfortran is picking up (but ifx is ignoring/bypassing)

@anton-seaice
Contributor

The cice5 gcc builds have "-fallow-argument-mismatch" if that's any consolation

@manodeep
Collaborator

Hmm - not sure I follow why that argument mismatch error is being generated in the first place though. MPI_Send takes a buffer of arbitrary type, and the datatype itself is one of the parameters. Those seem to have been filled out correctly in the function call itself.

From a quick read, StackOverflow opinions seemed pretty consistent - "fix your code" - but I don't see what/how to fix. Maybe -fallow-argument-mismatch is the way to go to silence this error.

@manodeep
Collaborator

@micaeljtoliveira clarified that we need to add the compiler flag @anton-seaice suggested. Here's our Zulip convo:

MO: gfortran is correct, as one is passing floats and other types to a routine whose arguments are declared as integers. It's a well-known issue, and you need to add the "-fallow-argument-mismatch" flag. To explain this further, the MPI Fortran 90 interface is declared using integers to allow some degree of polymorphism. The MPI Fortran 2008 interface uses type(*) instead.
MS: Not sure I follow - based on this man page, the first argument of MPI_Send is declared as type void *. Is the Fortran declaration different in the implementation? Or, maybe more importantly, is there a void type in Fortran?
MO: Yes, on the C side it's a void *, but there's nothing equivalent in Fortran 90. In practice it doesn't matter what you declared on the Fortran side, because the compiler will simply pass the argument by reference. The compiler is complaining, because there's a Fortran interface for that routine (which is external), that says it's an integer. So the compiler issues an error. Note that until gfortran 11 or 12, this was a warning instead.

Adding -fallow-argument-mismatch allows the compilation to progress but there is yet another error.

[/g/data/tm70/ms2335/oneapi_compilation_dir/test-upstream-oa3-mct-with-gcc/oasis3-mct-OASIS3-MCT_5.2/compile_oa3-mct/build/lib/psmile.MPI1 @gadi-login-01]$ mpifort -fdefault-real-8 -fallow-argument-mismatch   -g3 -O2  -march=cascadelake -mtune=sapphirerapids  -I/g/data/tm70/ms2335/oneapi_compilation_dir/test-upstream-oa3-mct-with-gcc/oasis3-mct-OASIS3-MCT_5.2/compile_oa3-mct/build/lib/psmile.MPI1 -I/g/data/tm70/ms2335/oneapi_compilation_dir/test-upstream-oa3-mct-with-gcc/oasis3-mct-OASIS3-MCT_5.2/compile_oa3-mct/build/lib/pio  -I/g/data/tm70/ms2335/oneapi_compilation_dir/test-upstream-oa3-mct-with-gcc/oasis3-mct-OASIS3-MCT_5.2/compile_oa3-mct/build/lib/mct -Duse_netCDF -Duse_comm_MPI1 -DTREAT_OVERLAY -I/apps/netcdf/4.9.2/include/GNU -I/g/data/tm70/ms2335/oneapi_compilation_dir/test-upstream-oa3-mct-with-gcc/oasis3-mct-OASIS3-MCT_5.2/lib/psmile/include -I/g/data/tm70/ms2335/oneapi_compilation_dir/test-upstream-oa3-mct-with-gcc/oasis3-mct-OASIS3-MCT_5.2/compile_oa3-mct/build/lib/mct -I/g/data/tm70/ms2335/oneapi_compilation_dir/test-upstream-oa3-mct-with-gcc/oasis3-mct-OASIS3-MCT_5.2/compile_oa3-mct/build/lib/scrip  -c   /g/data/tm70/ms2335/oneapi_compilation_dir/test-upstream-oa3-mct-with-gcc/oasis3-mct-OASIS3-MCT_5.2/lib/psmile/src/mod_oasis_load_balancing.F90
f951: Warning: Nonexistent include directory ‘/g/data/tm70/ms2335/oneapi_compilation_dir/test-upstream-oa3-mct-with-gcc/oasis3-mct-OASIS3-MCT_5.2/compile_oa3-mct/build/lib/pio’ [-Wmissing-include-dirs]
/g/data/tm70/ms2335/oneapi_compilation_dir/test-upstream-oa3-mct-with-gcc/oasis3-mct-OASIS3-MCT_5.2/lib/psmile/src/mod_oasis_load_balancing.F90:1024:96:

 1024 |                call oasis_mpi_send(DBLE(lb_diag_buff),0,lb_tag,mpi_comm_global,'oasis_lb_print')
      |                                                                                                1
Error: There is no specific subroutine for the generic ‘oasis_mpi_send’ at (1)

I can see that there is a generic, public oasis_mpi_send in mod_oasis_mpi.F90 - perhaps this is a case of REAL(4) vs REAL(8) and/or a lack of a suitable routine?

As a sanity check, I confirmed that I could still compile that source with ifx.

@anton-seaice
Contributor

The full set of cice5 flags is:

set(CMAKE_Fortran_FLAGS_RELEASE "-Wall -fdefault-real-8 -fdefault-double-8 -ffpe-trap=invalid,zero,overflow -fallow-argument-mismatch")

So possibly the default real 8 could help? I don't understand why, though.

@anton-seaice
Contributor

Are there build flags included with oasis3-mct? It's unclear to me!

@manodeep
Collaborator

@anton-seaice That did the trick - compiling with gfortran succeeds with -fdefault-real-8 -fdefault-double-8 -fallow-argument-mismatch (as well as the full set of CICE5 flags that you had posted).

FWIW, I checked that upstream v6 also compiles with these GCC flags.

Sidenote: v6 seems to have fixed the issue of massive binary files in the upstream repo - v6 tarball is 63MB vs 239MB for v5.
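
A possible explanation for why the second flag matters, based on the gfortran documentation rather than on anything verified against the OASIS source: -fdefault-real-8 on its own also promotes DOUBLE PRECISION to 16 bytes where possible, so DBLE(lb_diag_buff) would yield a REAL(16) value for which the generic oasis_mpi_send may have no specific routine, while adding -fdefault-double-8 keeps it at 8 bytes. The sketch below shows one way the flag set could be assembled per compiler. The function name and its arguments are hypothetical and are not part of Spack or the oasis3-mct build system; the version guard reflects that -fallow-argument-mismatch is only accepted by gfortran 10 and later.

```python
def gfortran_flags(compiler: str, major_version: int) -> list[str]:
    """Return extra Fortran flags for the given compiler family.

    Hypothetical helper for illustration only.
    """
    if compiler != "gcc":
        # ifx compiled the source without extra flags in this thread.
        return []

    # Make default REAL 8 bytes wide, and keep DOUBLE PRECISION at
    # 8 bytes instead of letting it be promoted to 16.
    flags = ["-fdefault-real-8", "-fdefault-double-8"]

    # The option does not exist before gfortran 10, and older
    # releases reject options they do not recognise.
    if major_version >= 10:
        flags.append("-fallow-argument-mismatch")
    return flags


print(" ".join(gfortran_flags("gcc", 15)))
```

The returned list could then be joined into whatever variable the site include file (for example make.nci) uses for Fortran flags.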

@manodeep
Collaborator

Are there build flags included with oasis3-mct? It's unclear to me!

You are supposed to create your own include file that defines the compiler flags, and the NETCDF include and link lines, e.g., make.nci

@aidanheerdegen
Member Author

Wow, so much activity. Thanks for the sleuthing. Could we solve this by injecting the required flags from the CI build manifest?

@aidanheerdegen
Member Author

My reasoning for this approach: we're not actually deploying with gcc, but it is useful to test against multiple compiler families to pick up possible issues (like the ones above!). So injecting in the CI manifest seems a reasonable compromise.

@manodeep
Collaborator

I am in favour of going hard in the CI with different compilers and code quality check flags, and failing the build for specific code-smell issues. At the same time, it would be convenient to see the build logs when the build fails - e.g., COMP.log and COMP.err in the repo/util/make_dir in this specific case - that would make it easier to debug and fix.
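
As a rough idea of what surfacing those logs could look like, the sketch below pulls the file, line and message out of gfortran-style diagnostics such as the ones pasted earlier. The function name is hypothetical, the sample path is shortened from the one in this thread, and how build-ci would locate and read COMP.err is left open.

```python
import re

# gfortran prints "<file>:<line>:<col>:" before each diagnostic and
# "Error: <message>" after the source excerpt.
LOCATION = re.compile(r"^(?P<file>\S+):(?P<line>\d+):(?P<col>\d+):\s*$")


def summarise_errors(log_text: str) -> list[tuple[str, int, str]]:
    """Pair each 'Error:' line with the most recent location line."""
    errors = []
    location = None
    for raw in log_text.splitlines():
        match = LOCATION.match(raw.strip())
        if match:
            location = (match["file"], int(match["line"]))
        elif raw.startswith("Error:") and location is not None:
            errors.append((*location, raw[len("Error:"):].strip()))
            location = None
    return errors


sample = """\
lib/scrip/src/remap_bicubic_reduced.F90:916:25:

  916 |           CALL MPI_IRecv(wts_map1(:,buff_base),&
      |                         1
Error: Type mismatch between actual argument at (1) and actual argument at (2) (REAL(8)/INTEGER(4)).
"""
for path, line, message in summarise_errors(sample):
    print(f"{path}:{line}: {message}")
```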

@aidanheerdegen
Member Author

I am in favour of going hard in the CI with different compilers and code quality check flags, and failing the build for specific code-smell issues

Me too, but it needs to be implemented with care. Feel free to add what you want. But any PR needs to pass before it gets merged. We can't have routine fails of CI tests being ignored.

Also I'd hold off until after the spack v1 merger, as it would be way too much of a headache to port all the changes to the spack v1.x PR.

At the same time, it would be convenient to see the build logs when the build fails - e.g., COMP.log and COMP.err in the repo/util/make_dir in this specific case - that would make it easier to debug and fix.

Can you check that an issue doesn't already exist for this, and if not make one?

https://github.com/ACCESS-NRI/build-ci/issues

@anton-seaice
Contributor

The minor annoyance in adding build flags to the spack manifest is that we will have to repeat it in the manifests in the access-nri/oasis3-mct repo

@aidanheerdegen
Member Author

The minor annoyance in adding build flags to the spack manifest is that we will have to repeat it in the manifests in the access-nri/oasis3-mct repo

For build CI? Yeah I suppose. It didn't work anyway.

I am disabling gcc checks in the meantime. It seems v2 (ESM1.5) doesn't compile with gcc either.

@manodeep
Collaborator

For build CI? Yeah I suppose. It didn't work anyway.

Would have been good to know why the build flags that fixed the compile for me do not fix the compile in the CI. But agree that disabling the gcc checks for now is the way to go.

And to clarify, when I say to go hard in the CI, I mean on the repo CI, i.e., access-nri/oasis3-mct.

anton-seaice previously approved these changes Jan 28, 2026
Contributor

@anton-seaice anton-seaice left a comment

This looks fine - I defer to @harshula and @manodeep

You could add a version for the v5 and v6 branches to use for build-CI

It looks like line 76 of package.py needs updating:

       oasis_version = "2.0"
       if self.spec.satisfies("@upstream,OASIS3-MCT_5.2"):
           oasis_version = "5"
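
One possible shape for the updated check, written as plain Python so that it runs without Spack. The function name and the set of version labels treated as v5 are assumptions for illustration; the real package.py would keep using self.spec.satisfies(...) with whatever version names the package actually declares.

```python
# Version labels assumed to build OASIS3-MCT 5.x sources (illustrative only).
V5_LABELS = {"upstream", "OASIS3-MCT_5.2", "5.2"}


def pkgconfig_version(version_label: str) -> str:
    """Return the version string to write into the generated .pc files."""
    if version_label in V5_LABELS:
        return "5"
    # Any other label falls back to the default from the quoted snippet.
    return "2.0"


for label in ("5.2", "other"):
    print(label, "->", pkgconfig_version(label))
```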

@aidanheerdegen
Member Author

It looks like line 76 of package.py needs updating

Thanks. Did not check properly for any version logic.

@aidanheerdegen
Member Author

You could add a version for the v5 and v6 branches to use for build-CI

I could, but I'm looking for the smallest possible change at this point and we're not using v6 currently. I'll leave that to others if it is required.

@aidanheerdegen
Member Author

It looks like line 76 of package.py needs updating

Thanks. Did not check properly for any version logic.

Ok, turned on the SSH access and confirmed for

==> [2026-01-29-01:05:11.715688] oasis3-mct: Building oasis3-mct-5.2-xtnoadmh7563cprbnd4ik4t33rxfibs6 [MakefilePackage]

version is correctly set:

 cat oasis3-mct.pc 
prefix=/opt/release/linux-rocky8-x86_64/intel-2021.10.0/oasis3-mct-5.2-xtnoadmh7563cprbnd4ik4t33rxfibs6
exec_prefix=${prefix}
libdir=${exec_prefix}/lib
includedir=${prefix}/include

Name: mct
Description: OASIS3-MCT 5 mct Library for Fortran
Version: 5
Libs: -L${libdir} -lmct
Cflags: -I${includedir}/mct
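
A check like this could also be scripted rather than done by hand over SSH. The sketch below reads the "Key: value" fields of a pkg-config file; the function name is hypothetical and the sample text is a shortened stand-in for the file shown above, with a made-up prefix.

```python
def read_pc_fields(text: str) -> dict[str, str]:
    """Collect the 'Key: value' lines of a pkg-config file.

    Variable assignments such as 'prefix=...' are skipped.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Keys are single words; this also skips 'name=value' variable lines.
        if sep and key and " " not in key and "=" not in key:
            fields[key] = value.strip()
    return fields


pc_text = """\
prefix=/opt/example
libdir=${prefix}/lib

Name: mct
Description: OASIS3-MCT 5 mct Library for Fortran
Version: 5
Libs: -L${libdir} -lmct
"""
fields = read_pc_fields(pc_text)
print(fields["Name"], fields["Version"])
```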

@harshula
Collaborator

FYI, oasis3-mct build error log was at /tmp/root/spack-stage/spack-stage-oasis3-mct-5.2-oyzvcv2micwvodnaudmhzcfws6vso2pj/spack-src/util/make_dir/COMP.err in my Docker instance.

@harshula
Collaborator

Hi @aidanheerdegen, do #367 (comment), i.e. modify the oasis3-mct SPR.

Collaborator

@harshula harshula left a comment

Thanks, please remember to do a squash merge.

@aidanheerdegen aidanheerdegen merged commit b9d1522 into main Jan 30, 2026
4 checks passed
Development

Successfully merging this pull request may close these issues.

Update oasis3-mct package with latest version on ACCESS repo

5 participants